This data set contains 113,937 loans with 81 variables on each loan, including loan amount, borrower rate (or interest rate), current loan status, borrower income, borrower employment status, borrower credit history, and the latest payment information.

In this analysis I want to understand which party benefits most from Prosper loans, and in which situations: Prosper, the investor, or the borrower. How easy is it for an investor to lose all their money on a Prosper loan? Which loans are most likely to earn an investor a large return? I also want to understand how specific credit ratings and scores relate to borrower characteristics. I hope to provide some interesting insights into the world of Prosper loans by the end of this analysis.

Analysis and Exploration of the Data

Univariate Plots

Size of Data set and Data Types

## 'data.frame':    113937 obs. of  81 variables:
##  $ ListingKey                         : Factor w/ 113066 levels "00003546482094282EF90E5",..: 7180 7193 6647 6669 6686 6689 6699 6706 6687 6687 ...
##  $ ListingNumber                      : int  193129 1209647 81716 658116 909464 1074836 750899 768193 1023355 1023355 ...
##  $ ListingCreationDate                : Factor w/ 113064 levels "2005-11-09 20:44:28.847000000",..: 14184 111894 6429 64760 85967 100310 72556 74019 97834 97834 ...
##  $ CreditGrade                        : Factor w/ 8 levels "A","AA","B","C",..: 4 NA 7 NA NA NA NA NA NA NA ...
##  $ Term                               : int  36 36 36 36 36 60 36 36 36 36 ...
##  $ LoanStatus                         : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
##  $ ClosedDate                         : Factor w/ 2802 levels "2005-11-25 00:00:00",..: 1137 NA 1262 NA NA NA NA NA NA NA ...
##  $ BorrowerAPR                        : num  0.165 0.12 0.283 0.125 0.246 ...
##  $ BorrowerRate                       : num  0.158 0.092 0.275 0.0974 0.2085 ...
##  $ LenderYield                        : num  0.138 0.082 0.24 0.0874 0.1985 ...
##  $ EstimatedEffectiveYield            : num  NA 0.0796 NA 0.0849 0.1832 ...
##  $ EstimatedLoss                      : num  NA 0.0249 NA 0.0249 0.0925 ...
##  $ EstimatedReturn                    : num  NA 0.0547 NA 0.06 0.0907 ...
##  $ ProsperRating.numeric              : int  NA 6 NA 6 3 5 2 4 7 7 ...
##  $ ProsperRating.Alpha                : Factor w/ 7 levels "A","AA","B","C",..: NA 1 NA 1 5 3 6 4 2 2 ...
##  $ ProsperScore                       : num  NA 7 NA 9 4 10 2 4 9 11 ...
##  $ ListingCategory.numeric            : int  0 2 0 16 2 1 1 2 7 7 ...
##  $ BorrowerState                      : Factor w/ 51 levels "AK","AL","AR",..: 6 6 11 11 24 33 17 5 15 15 ...
##  $ Occupation                         : Factor w/ 67 levels "Accountant/CPA",..: 36 42 36 51 20 42 49 28 23 23 ...
##  $ EmploymentStatus                   : Factor w/ 8 levels "Employed","Full-time",..: 8 1 3 1 1 1 1 1 1 1 ...
##  $ EmploymentStatusDuration           : int  2 44 NA 113 44 82 172 103 269 269 ...
##  $ IsBorrowerHomeowner                : Factor w/ 2 levels "False","True": 2 1 1 2 2 2 1 1 2 2 ...
##  $ CurrentlyInGroup                   : Factor w/ 2 levels "False","True": 2 1 2 1 1 1 1 1 1 1 ...
##  $ GroupKey                           : Factor w/ 706 levels "00343376901312423168731",..: NA NA 334 NA NA NA NA NA NA NA ...
##  $ DateCreditPulled                   : Factor w/ 112992 levels "2005-11-09 00:30:04.487000000",..: 14347 111883 6446 64724 85857 100382 72500 73937 97888 97888 ...
##  $ CreditScoreRangeLower              : int  640 680 480 800 680 740 680 700 820 820 ...
##  $ CreditScoreRangeUpper              : int  659 699 499 819 699 759 699 719 839 839 ...
##  $ FirstRecordedCreditLine            : Factor w/ 11585 levels "1947-08-24 00:00:00",..: 8638 6616 8926 2246 9497 496 8264 7684 5542 5542 ...
##  $ CurrentCreditLines                 : int  5 14 NA 5 19 21 10 6 17 17 ...
##  $ OpenCreditLines                    : int  4 14 NA 5 19 17 7 6 16 16 ...
##  $ TotalCreditLinespast7years         : int  12 29 3 29 49 49 20 10 32 32 ...
##  $ OpenRevolvingAccounts              : int  1 13 0 7 6 13 6 5 12 12 ...
##  $ OpenRevolvingMonthlyPayment        : num  24 389 0 115 220 1410 214 101 219 219 ...
##  $ InquiriesLast6Months               : int  3 3 0 0 1 0 0 3 1 1 ...
##  $ TotalInquiries                     : num  3 5 1 1 9 2 0 16 6 6 ...
##  $ CurrentDelinquencies               : int  2 0 1 4 0 0 0 0 0 0 ...
##  $ AmountDelinquent                   : num  472 0 NA 10056 0 ...
##  $ DelinquenciesLast7Years            : int  4 0 0 14 0 0 0 0 0 0 ...
##  $ PublicRecordsLast10Years           : int  0 1 0 0 0 0 0 1 0 0 ...
##  $ PublicRecordsLast12Months          : int  0 0 NA 0 0 0 0 0 0 0 ...
##  $ RevolvingCreditBalance             : num  0 3989 NA 1444 6193 ...
##  $ BankcardUtilization                : num  0 0.21 NA 0.04 0.81 0.39 0.72 0.13 0.11 0.11 ...
##  $ AvailableBankcardCredit            : num  1500 10266 NA 30754 695 ...
##  $ TotalTrades                        : num  11 29 NA 26 39 47 16 10 29 29 ...
##  $ TradesNeverDelinquent.percentage   : num  0.81 1 NA 0.76 0.95 1 0.68 0.8 1 1 ...
##  $ TradesOpenedLast6Months            : num  0 2 NA 0 2 0 0 0 1 1 ...
##  $ DebtToIncomeRatio                  : num  0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...
##  $ IncomeRange                        : Factor w/ 8 levels "$0","$1-24,999",..: 4 5 7 4 3 3 4 4 4 4 ...
##  $ IncomeVerifiable                   : Factor w/ 2 levels "False","True": 2 2 2 2 2 2 2 2 2 2 ...
##  $ StatedMonthlyIncome                : num  3083 6125 2083 2875 9583 ...
##  $ LoanKey                            : Factor w/ 113066 levels "00003683605746079487FF7",..: 100337 69837 46303 70776 71387 86505 91250 5425 908 908 ...
##  $ TotalProsperLoans                  : int  NA NA NA NA 1 NA NA NA NA NA ...
##  $ TotalProsperPaymentsBilled         : int  NA NA NA NA 11 NA NA NA NA NA ...
##  $ OnTimeProsperPayments              : int  NA NA NA NA 11 NA NA NA NA NA ...
##  $ ProsperPaymentsLessThanOneMonthLate: int  NA NA NA NA 0 NA NA NA NA NA ...
##  $ ProsperPaymentsOneMonthPlusLate    : int  NA NA NA NA 0 NA NA NA NA NA ...
##  $ ProsperPrincipalBorrowed           : num  NA NA NA NA 11000 NA NA NA NA NA ...
##  $ ProsperPrincipalOutstanding        : num  NA NA NA NA 9948 ...
##  $ ScorexChangeAtTimeOfListing        : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ LoanCurrentDaysDelinquent          : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ LoanFirstDefaultedCycleNumber      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ LoanMonthsSinceOrigination         : int  78 0 86 16 6 3 11 10 3 3 ...
##  $ LoanNumber                         : int  19141 134815 6466 77296 102670 123257 88353 90051 121268 121268 ...
##  $ LoanOriginalAmount                 : int  9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
##  $ LoanOriginationDate                : Factor w/ 1873 levels "2005-11-15 00:00:00",..: 426 1866 260 1535 1757 1821 1649 1666 1813 1813 ...
##  $ LoanOriginationQuarter             : Factor w/ 33 levels "Q1 2006","Q1 2007",..: 18 8 2 32 24 33 16 16 33 33 ...
##  $ MemberKey                          : Factor w/ 90831 levels "00003397697413387CAF966",..: 11071 10302 33781 54939 19465 48037 60448 40951 26129 26129 ...
##  $ MonthlyLoanPayment                 : num  330 319 123 321 564 ...
##  $ LP_CustomerPayments                : num  11396 0 4187 5143 2820 ...
##  $ LP_CustomerPrincipalPayments       : num  9425 0 3001 4091 1563 ...
##  $ LP_InterestandFees                 : num  1971 0 1186 1052 1257 ...
##  $ LP_ServiceFees                     : num  -133.2 0 -24.2 -108 -60.3 ...
##  $ LP_CollectionFees                  : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_GrossPrincipalLoss              : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_NetPrincipalLoss                : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_NonPrincipalRecoverypayments    : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ PercentFunded                      : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ Recommendations                    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ InvestmentFromFriendsCount         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ InvestmentFromFriendsAmount        : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Investors                          : int  258 1 41 158 20 1 1 1 1 1 ...

Summary Statistics of Data

Table continues below
ListingKey ListingNumber ListingCreationDate
17A93590655669644DB4C06: 6 Min. : 4 2013-10-02 17:20:16.550000000: 6
349D3587495831350F0F648: 4 1st Qu.: 400919 2013-08-28 20:31:41.107000000: 4
47C1359638497431975670B: 4 Median : 600554 2013-09-08 09:27:44.853000000: 4
8474358854651984137201C: 4 Mean : 627886 2013-12-06 05:43:13.830000000: 4
DE8535960513435199406CE: 4 3rd Qu.: 892634 2013-12-06 11:44:58.283000000: 4
04C13599434217079754AEE: 3 Max. :1255725 2013-08-21 07:25:22.360000000: 3
(Other) :113912 NA (Other) :113912
Table continues below
CreditGrade Term LoanStatus
C : 5649 Min. :12.00 Current :56576
D : 5153 1st Qu.:36.00 Completed :38074
B : 4389 Median :36.00 Chargedoff :11992
AA : 3509 Mean :40.83 Defaulted : 5018
HR : 3508 3rd Qu.:36.00 Past Due (1-15 days) : 806
(Other): 6745 Max. :60.00 Past Due (31-60 days): 363
NA’s :84984 NA (Other) : 1108
Table continues below
ClosedDate BorrowerAPR BorrowerRate LenderYield
2014-03-04 00:00:00: 105 Min. :0.00653 Min. :0.0000 Min. :-0.0100
2014-02-19 00:00:00: 100 1st Qu.:0.15629 1st Qu.:0.1340 1st Qu.: 0.1242
2014-02-11 00:00:00: 92 Median :0.20976 Median :0.1840 Median : 0.1730
2012-10-30 00:00:00: 81 Mean :0.21883 Mean :0.1928 Mean : 0.1827
2013-02-26 00:00:00: 78 3rd Qu.:0.28381 3rd Qu.:0.2500 3rd Qu.: 0.2400
(Other) :54633 Max. :0.51229 Max. :0.4975 Max. : 0.4925
NA’s :58848 NA’s :25 NA NA
Table continues below
EstimatedEffectiveYield EstimatedLoss EstimatedReturn
Min. :-0.183 Min. :0.005 Min. :-0.183
1st Qu.: 0.116 1st Qu.:0.042 1st Qu.: 0.074
Median : 0.162 Median :0.072 Median : 0.092
Mean : 0.169 Mean :0.080 Mean : 0.096
3rd Qu.: 0.224 3rd Qu.:0.112 3rd Qu.: 0.117
Max. : 0.320 Max. :0.366 Max. : 0.284
NA’s :29084 NA’s :29084 NA’s :29084
Table continues below
ProsperRating.numeric ProsperRating.Alpha ProsperScore
Min. :1.000 C :18345 Min. : 1.00
1st Qu.:3.000 B :15581 1st Qu.: 4.00
Median :4.000 A :14551 Median : 6.00
Mean :4.072 D :14274 Mean : 5.95
3rd Qu.:5.000 E : 9795 3rd Qu.: 8.00
Max. :7.000 (Other):12307 Max. :11.00
NA’s :29084 NA’s :29084 NA’s :29084
Table continues below
ListingCategory.numeric BorrowerState Occupation
Min. : 0.000 CA :14717 Other :28617
1st Qu.: 1.000 TX : 6842 Professional :13628
Median : 1.000 NY : 6729 Computer Programmer: 4478
Mean : 2.774 FL : 6720 Executive : 4311
3rd Qu.: 3.000 IL : 5921 Teacher : 3759
Max. :20.000 (Other):67493 (Other) :55556
NA NA’s : 5515 NA’s : 3588
Table continues below
EmploymentStatus EmploymentStatusDuration IsBorrowerHomeowner
Employed :67322 Min. : 0.00 False:56459
Full-time :26355 1st Qu.: 26.00 True :57478
Self-employed: 6134 Median : 67.00 NA
Not available: 5347 Mean : 96.07 NA
Other : 3806 3rd Qu.:137.00 NA
(Other) : 2718 Max. :755.00 NA
NA’s : 2255 NA’s :7625 NA
Table continues below
CurrentlyInGroup GroupKey DateCreditPulled
False:101218 783C3371218786870A73D20: 1140 2013-12-23 09:38:12: 6
True : 12719 3D4D3366260257624AB272D: 916 2013-11-21 09:09:41: 4
NA 6A3B336601725506917317E: 698 2013-12-06 05:43:16: 4
NA FEF83377364176536637E50: 611 2014-01-14 20:17:49: 4
NA C9643379247860156A00EC0: 342 2014-02-09 12:14:41: 4
NA (Other) : 9634 2013-09-27 22:04:54: 3
NA NA’s :100596 (Other) :113912
Table continues below
CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
Min. : 0.0 Min. : 19.0 1993-12-01 00:00:00: 185
1st Qu.:660.0 1st Qu.:679.0 1994-11-01 00:00:00: 178
Median :680.0 Median :699.0 1995-11-01 00:00:00: 168
Mean :685.6 Mean :704.6 1990-04-01 00:00:00: 161
3rd Qu.:720.0 3rd Qu.:739.0 1995-03-01 00:00:00: 159
Max. :880.0 Max. :899.0 (Other) :112389
NA’s :591 NA’s :591 NA’s : 697
Table continues below
CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
Min. : 0.00 Min. : 0.00 Min. : 2.00
1st Qu.: 7.00 1st Qu.: 6.00 1st Qu.: 17.00
Median :10.00 Median : 9.00 Median : 25.00
Mean :10.32 Mean : 9.26 Mean : 26.75
3rd Qu.:13.00 3rd Qu.:12.00 3rd Qu.: 35.00
Max. :59.00 Max. :54.00 Max. :136.00
NA’s :7604 NA’s :7604 NA’s :697
Table continues below
OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
Min. : 0.00 Min. : 0.0 Min. : 0.000
1st Qu.: 4.00 1st Qu.: 114.0 1st Qu.: 0.000
Median : 6.00 Median : 271.0 Median : 1.000
Mean : 6.97 Mean : 398.3 Mean : 1.435
3rd Qu.: 9.00 3rd Qu.: 525.0 3rd Qu.: 2.000
Max. :51.00 Max. :14985.0 Max. :105.000
NA NA NA’s :697
Table continues below
TotalInquiries CurrentDelinquencies AmountDelinquent
Min. : 0.000 Min. : 0.0000 Min. : 0.0
1st Qu.: 2.000 1st Qu.: 0.0000 1st Qu.: 0.0
Median : 4.000 Median : 0.0000 Median : 0.0
Mean : 5.584 Mean : 0.5921 Mean : 984.5
3rd Qu.: 7.000 3rd Qu.: 0.0000 3rd Qu.: 0.0
Max. :379.000 Max. :83.0000 Max. :463881.0
NA’s :1159 NA’s :697 NA’s :7622
Table continues below
DelinquenciesLast7Years PublicRecordsLast10Years PublicRecordsLast12Months
Min. : 0.000 Min. : 0.0000 Min. : 0.000
1st Qu.: 0.000 1st Qu.: 0.0000 1st Qu.: 0.000
Median : 0.000 Median : 0.0000 Median : 0.000
Mean : 4.155 Mean : 0.3126 Mean : 0.015
3rd Qu.: 3.000 3rd Qu.: 0.0000 3rd Qu.: 0.000
Max. :99.000 Max. :38.0000 Max. :20.000
NA’s :990 NA’s :697 NA’s :7604
Table continues below
RevolvingCreditBalance BankcardUtilization AvailableBankcardCredit
Min. : 0 Min. :0.000 Min. : 0
1st Qu.: 3121 1st Qu.:0.310 1st Qu.: 880
Median : 8549 Median :0.600 Median : 4100
Mean : 17599 Mean :0.561 Mean : 11210
3rd Qu.: 19521 3rd Qu.:0.840 3rd Qu.: 13180
Max. :1435667 Max. :5.950 Max. :646285
NA’s :7604 NA’s :7604 NA’s :7544
Table continues below
TotalTrades TradesNeverDelinquent.percentage TradesOpenedLast6Months
Min. : 0.00 Min. :0.000 Min. : 0.000
1st Qu.: 15.00 1st Qu.:0.820 1st Qu.: 0.000
Median : 22.00 Median :0.940 Median : 0.000
Mean : 23.23 Mean :0.886 Mean : 0.802
3rd Qu.: 30.00 3rd Qu.:1.000 3rd Qu.: 1.000
Max. :126.00 Max. :1.000 Max. :20.000
NA’s :7544 NA’s :7544 NA’s :7544
Table continues below
DebtToIncomeRatio IncomeRange IncomeVerifiable
Min. : 0.000 $25,000-49,999:32192 False: 8669
1st Qu.: 0.140 $50,000-74,999:31050 True :105268
Median : 0.220 $100,000+ :17337 NA
Mean : 0.276 $75,000-99,999:16916 NA
3rd Qu.: 0.320 Not displayed : 7741 NA
Max. :10.010 $1-24,999 : 7274 NA
NA’s :8554 (Other) : 1427 NA
Table continues below
StatedMonthlyIncome LoanKey TotalProsperLoans
Min. : 0 CB1B37030986463208432A1: 6 Min. :0.00
1st Qu.: 3200 2DEE3698211017519D7333F: 4 1st Qu.:1.00
Median : 4667 9F4B37043517554537C364C: 4 Median :1.00
Mean : 5608 D895370150591392337ED6D: 4 Mean :1.42
3rd Qu.: 6825 E6FB37073953690388BC56D: 4 3rd Qu.:2.00
Max. :1750003 0D8F37036734373301ED419: 3 Max. :8.00
NA (Other) :113912 NA’s :91852
Table continues below
TotalProsperPaymentsBilled OnTimeProsperPayments
Min. : 0.00 Min. : 0.00
1st Qu.: 9.00 1st Qu.: 9.00
Median : 16.00 Median : 15.00
Mean : 22.93 Mean : 22.27
3rd Qu.: 33.00 3rd Qu.: 32.00
Max. :141.00 Max. :141.00
NA’s :91852 NA’s :91852
Table continues below
ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
Min. : 0.00 Min. : 0.00
1st Qu.: 0.00 1st Qu.: 0.00
Median : 0.00 Median : 0.00
Mean : 0.61 Mean : 0.05
3rd Qu.: 0.00 3rd Qu.: 0.00
Max. :42.00 Max. :21.00
NA’s :91852 NA’s :91852
Table continues below
ProsperPrincipalBorrowed ProsperPrincipalOutstanding
Min. : 0 Min. : 0
1st Qu.: 3500 1st Qu.: 0
Median : 6000 Median : 1627
Mean : 8472 Mean : 2930
3rd Qu.:11000 3rd Qu.: 4127
Max. :72499 Max. :23451
NA’s :91852 NA’s :91852
Table continues below
ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
Min. :-209.00 Min. : 0.0
1st Qu.: -35.00 1st Qu.: 0.0
Median : -3.00 Median : 0.0
Mean : -3.22 Mean : 152.8
3rd Qu.: 25.00 3rd Qu.: 0.0
Max. : 286.00 Max. :2704.0
NA’s :95009 NA
Table continues below
LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
Min. : 0.00 Min. : 0.0 Min. : 1
1st Qu.: 9.00 1st Qu.: 6.0 1st Qu.: 37332
Median :14.00 Median : 21.0 Median : 68599
Mean :16.27 Mean : 31.9 Mean : 69444
3rd Qu.:22.00 3rd Qu.: 65.0 3rd Qu.:101901
Max. :44.00 Max. :100.0 Max. :136486
NA’s :96985 NA NA
Table continues below
LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
Min. : 1000 2014-01-22 00:00:00: 491 Q4 2013:14450
1st Qu.: 4000 2013-11-13 00:00:00: 490 Q1 2014:12172
Median : 6500 2014-02-19 00:00:00: 439 Q3 2013: 9180
Mean : 8337 2013-10-16 00:00:00: 434 Q2 2013: 7099
3rd Qu.:12000 2014-01-28 00:00:00: 339 Q3 2012: 5632
Max. :35000 2013-09-24 00:00:00: 316 Q2 2012: 5061
NA (Other) :111428 (Other):60343
Table continues below
MemberKey MonthlyLoanPayment LP_CustomerPayments
63CA34120866140639431C9: 9 Min. : 0.0 Min. : -2.35
16083364744933457E57FB9: 8 1st Qu.: 131.6 1st Qu.: 1005.76
3A2F3380477699707C81385: 8 Median : 217.7 Median : 2583.83
4D9C3403302047712AD0CDD: 8 Mean : 272.5 Mean : 4183.08
739C338135235294782AE75: 8 3rd Qu.: 371.6 3rd Qu.: 5548.40
7E1733653050264822FAA3D: 8 Max. :2251.5 Max. :40702.39
(Other) :113888 NA NA
Table continues below
LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
Min. : 0.0 Min. : -2.35 Min. :-664.87
1st Qu.: 500.9 1st Qu.: 274.87 1st Qu.: -73.18
Median : 1587.5 Median : 700.84 Median : -34.44
Mean : 3105.5 Mean : 1077.54 Mean : -54.73
3rd Qu.: 4000.0 3rd Qu.: 1458.54 3rd Qu.: -13.92
Max. :35000.0 Max. :15617.03 Max. : 32.06
NA NA NA
Table continues below
LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
Min. :-9274.75 Min. : -94.2 Min. : -954.5
1st Qu.: 0.00 1st Qu.: 0.0 1st Qu.: 0.0
Median : 0.00 Median : 0.0 Median : 0.0
Mean : -14.24 Mean : 700.4 Mean : 681.4
3rd Qu.: 0.00 3rd Qu.: 0.0 3rd Qu.: 0.0
Max. : 0.00 Max. :25000.0 Max. :25000.0
NA NA NA
Table continues below
LP_NonPrincipalRecoverypayments PercentFunded Recommendations
Min. : 0.00 Min. :0.7000 Min. : 0.00000
1st Qu.: 0.00 1st Qu.:1.0000 1st Qu.: 0.00000
Median : 0.00 Median :1.0000 Median : 0.00000
Mean : 25.14 Mean :0.9986 Mean : 0.04803
3rd Qu.: 0.00 3rd Qu.:1.0000 3rd Qu.: 0.00000
Max. :21117.90 Max. :1.0125 Max. :39.00000
NA NA NA
InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
Min. : 0.00000 Min. : 0.00 Min. : 1.00
1st Qu.: 0.00000 1st Qu.: 0.00 1st Qu.: 2.00
Median : 0.00000 Median : 0.00 Median : 44.00
Mean : 0.02346 Mean : 16.55 Mean : 80.48
3rd Qu.: 0.00000 3rd Qu.: 0.00 3rd Qu.: 115.00
Max. :33.00000 Max. :25000.00 Max. :1189.00
NA NA NA

Credit grades, which Prosper assigned to listings before 2009, are missing for most loans (84,984 of 113,937), but among the loans that have one, a C grade is most prevalent. I wonder why borrowers with a C credit grade are granted the most loans.
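Later output in this analysis refers to a combined variable named AllCreditGrades. The source does not show how it was built, so the following is only a sketch of one plausible construction: take CreditGrade where it exists and fall back to ProsperRating.Alpha otherwise. The five toy rows mirror the first values in the structure listing above; the level order and the fallback rule are assumptions.

```r
# Toy rows mirroring the first five loans in the structure listing.
toy <- data.frame(
  CreditGrade         = c("C", NA, "HR", NA, NA),
  ProsperRating.Alpha = c(NA, "A", NA, "A", "D"),
  stringsAsFactors    = FALSE
)

# Assumed ordering from best to worst grade.
grade_levels <- c("AA", "A", "B", "C", "D", "E", "HR", "NC")

# Use the older credit grade when present, otherwise the Prosper rating.
combined <- ifelse(is.na(toy$CreditGrade),
                   toy$ProsperRating.Alpha,
                   toy$CreditGrade)

# Store the result as an ordered set of levels so summaries print in grade order.
toy$AllCreditGrades <- factor(combined, levels = grade_levels)
print(toy)
```

Working on character vectors rather than factors avoids `ifelse()` silently returning integer level codes.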

Lower credit scores mostly fall between 360 and 880. Some borrowers have a lower credit score of 0, since a score below 350 is automatically set to zero. With these zeros removed, you can see odd gaps around 600-620 and 780-800 that contain no scores. I do not understand these empty stretches, but overall the distribution is roughly normal.

There are a few outlier upper credit scores of 19 (the upper bound that pairs with a lower score of 0), but with these removed the overall plot is roughly normal, with small gaps around 600-620 and 780-800 just like in the lower credit scores.
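One thing worth checking about the empty stretches: every lower score sampled in the structure listing is a multiple of 20, so histogram bins that do not line up with that spacing can look empty even when no scores are missing. The sketch below illustrates the effect on made-up scores. The helper names `count_bins` and `interior_gaps` are hypothetical, and whether this explains the gaps in the real plots is an untested assumption.

```r
# Toy lower-bound scores: multiples of 20 plus a few zero placeholders.
scores <- c(0, 0, rep(seq(520, 880, by = 20), times = 3))

# Remove the zero placeholders before examining the distribution.
valid <- scores[scores > 0]

# Count scores per bin for a given bin width (hypothetical helper).
count_bins <- function(x, width) {
  breaks <- seq(500, 900 + width, by = width)
  table(cut(x, breaks = breaks, right = FALSE))
}

aligned    <- count_bins(valid, 20)  # bin width matches the score spacing
misaligned <- count_bins(valid, 15)  # bin width does not match

# Count empty bins lying between the lowest and highest occupied bin.
interior_gaps <- function(counts) {
  occupied <- which(counts > 0)
  inner <- counts[min(occupied):max(occupied)]
  sum(inner == 0)
}

print(interior_gaps(aligned))     # no gaps when bins line up with the scores
print(interior_gaps(misaligned))  # gaps appear although no score is missing
```

If the real histograms were drawn with a bin width other than 20, redrawing them with a width of 20 would show whether the gaps persist.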

36 months is the most prevalent loan term. I wonder why a 36-month loan is so common.

While there are many defaulted or charged off loans, most loans are current or completed.

Estimated returns seem to have a roughly normal distribution centered around 0.1, with some outliers that extend into negative returns.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  -0.183   0.074   0.092   0.096   0.117   0.284   29084

More Prosper borrowers reside in California than in any other state. It seems that the states with the largest cities have the largest numbers of borrowers.

Most borrowers have a debt-to-income ratio around 0.22, but the distribution is positively skewed with many large outliers. For example, some borrowers have a ratio of 10.01. I wonder which type of borrower takes on such a large debt-to-income ratio?

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.1400  0.2200  0.2759  0.3200 10.0100
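The skew can be checked directly from the summary values printed above. The sketch below applies the conventional 1.5 x IQR rule for flagging upper outliers. That rule is a common convention brought in here for illustration, not something the original analysis states it used.

```r
# Summary statistics for DebtToIncomeRatio, copied from the output above.
dti <- c(min = 0, q1 = 0.14, median = 0.22, mean = 0.2759, q3 = 0.32, max = 10.01)

# A mean above the median is a quick sign of positive (right) skew.
mean_above_median <- dti[["mean"]] > dti[["median"]]

# Conventional upper fence: third quartile plus 1.5 times the interquartile range.
iqr <- dti[["q3"]] - dti[["q1"]]
upper_fence <- dti[["q3"]] + 1.5 * iqr

print(mean_above_median)
print(upper_fence)
print(dti[["max"]] > upper_fence)  # is the maximum an outlier under this rule?
```

Under this rule any ratio above about 0.59 counts as an upper outlier, so the maximum of 10.01 lies far outside the bulk of the data.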

The most common income range among Prosper borrowers is $25,000-49,999.

Most Prosper borrowers make around $4,000 a month, which falls right in line with a yearly income of about $48,000. At the same time, one borrower had a stated monthly income of about $1,750,000. Could this outlier be an error? This person would be making over $21,000,000 a year, yet needed a loan of only $4,000.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0    3200    4667    5608    6825 1750003
##   LoanOriginalAmount
## 1               4000
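The yearly figures quoted above follow from multiplying the monthly values by 12. A short worked check, using the maximum from the summary and the loan amount printed for that borrower:

```r
# Monthly incomes taken from the text and the summary output above.
typical_monthly <- 4000
outlier_monthly <- 1750003   # maximum StatedMonthlyIncome

# Annualize by multiplying by 12 months.
typical_annual <- typical_monthly * 12
outlier_annual <- outlier_monthly * 12

# Size of the outlier's loan relative to the implied yearly income.
loan_amount <- 4000
loan_share_of_income <- loan_amount / outlier_annual

print(typical_annual)
print(outlier_annual)
print(loan_share_of_income)
```

The loan would amount to roughly 0.02% of the implied yearly income, which supports treating the stated income as a likely data-entry error.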

There seem to be spikes at specific loan amounts, which may stem from the way Prosper creates loans. The distribution of loan amounts is also positively skewed, with a median of $6,500 and a maximum of $35,000.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    4000    6500    8337   12000   35000

Closed dates do not seem to have any outliers, but there is a large dip in the number of closed loans in 2012. I wonder why that is.

##         Min.      1st Qu.       Median         Mean      3rd Qu. 
## "2005-11-25" "2009-07-14" "2011-04-05" "2011-03-07" "2013-01-30" 
##         Max. 
## "2014-03-10"

Borrower APR does not have many outliers, and the median APR is 0.2098. The distribution seems roughly normal, with isolated spikes at 0.3 and 0.38.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00653 0.15629 0.20976 0.21883 0.28381 0.51229

Most borrowers are listed under debt consolidation (listing category 1 in the table below, with 58,308 loans).

## 
##     0     1     2     3     4     5     6     7     8     9    10    11 
## 16965 58308  7433  7189  2395   756  2572 10494   199    85    91   217 
##    12    13    14    15    16    17    18    19    20 
##    59  1996   876  1522   304    52   885   768   771

Almost all borrowers (109,678 of 113,937) have no recommendations when they get a loan from Prosper, which is staggering. Even after removing those with no recommendations, the large majority of the rest have only one. It seems that most people who get Prosper loans either do not have many people vouching for them, or recommendations are not very important to lenders.

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##  0.00000  0.00000  0.00000  0.04803  0.00000 39.00000
## 
##      0      1      2      3      4      5      6      7      8      9 
## 109678   3516    568    108     26     14      4      5      3      6 
##     14     16     18     19     21     24     39 
##      1      2      2      1      1      1      1
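The proportions behind these statements can be recomputed from the frequency table printed above. This sketch copies the counts into a named vector; nothing in it comes from outside that table.

```r
# Number of loans by recommendation count, copied from the table above.
counts <- c(`0` = 109678, `1` = 3516, `2` = 568, `3` = 108, `4` = 26,
            `5` = 14, `6` = 4, `7` = 5, `8` = 3, `9` = 6, `14` = 1,
            `16` = 2, `18` = 2, `19` = 1, `21` = 1, `24` = 1, `39` = 1)

# Share of all loans with no recommendations.
share_zero <- counts[["0"]] / sum(counts)

# Among loans with at least one recommendation, share with exactly one.
nonzero <- counts[names(counts) != "0"]
share_one_of_nonzero <- counts[["1"]] / sum(nonzero)

print(round(share_zero, 3))
print(round(share_one_of_nonzero, 3))
```

About 96% of loans have no recommendations, and about 83% of the remaining loans have exactly one.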

The distribution of current credit lines is very smooth, with a slight positive skew since there are quite a few large upper outliers. The median is 10 current credit lines.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    7.00   10.00   10.32   13.00   59.00

Most borrowers (about 82%) are listed as employed or full-time. It seems that having a steady job is very important for convincing lenders that you will pay off your loan.

##      Employed     Full-time Not available  Not employed         Other 
##         67322         26355          5347           835          3806 
##     Part-time       Retired Self-employed 
##          1088           795          6134

Investment from friends is very positively skewed. The distribution is easier to see on a log scale.
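A log scale needs some care here because most loans have an investment from friends of zero, and the logarithm of zero is not finite. The sketch below, on made-up amounts, drops the zeros before taking log10 and then bins the result. Dropping zeros (rather than, say, adding 1 first) is an assumption about how the plot was prepared.

```r
# Toy investment amounts in dollars; most loans have none.
amounts <- c(0, 0, 0, 0, 25, 50, 150, 500, 2000, 25000)

# log10(0) is -Inf, so keep only the positive amounts.
positive <- amounts[amounts > 0]
log_amounts <- log10(positive)

# Bin by order of magnitude without drawing: (0,1], (1,2], ..., (4,5].
h <- hist(log_amounts, breaks = 0:5, plot = FALSE)
print(h$counts)
```

Each bin spans one order of magnitude, so amounts from tens of dollars to tens of thousands fit on the same axis.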

Bivariate Plots

Leading up to the 2008 financial crisis there seemed to be a large increase in defaulted loans. Those loans seemed to convert to charged-off loans after 2008, and these then increased in number up to around 2009, after which they slowly decreased as completed loans greatly increased in number. It seems that a drastic decrease in completed loans led to the large overall decrease in the number of closed loans in 2012. After this, both charged-off and completed loans increase up to the present.

Most of the strongly correlated pairs in the correlation matrix are variables that are related by construction, so on their own they do not reveal much about relationships in the data.

Borrowers with a stated monthly income below about $8,334 (roughly $100,000 a year) seem unable to get a loan larger than $25,000. I wonder whether Prosper caps loans at $25,000 for lower-income borrowers?
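To illustrate what "interrelated" means here, the sketch below builds a small correlation matrix from the first six rows of the structure listing above. In those rows CreditScoreRangeUpper is always CreditScoreRangeLower plus 19, so the pair correlates perfectly without telling us anything new. The choice of columns is an assumption made for illustration; the original matrix is not shown in the text.

```r
# First six rows of three columns from the structure listing above.
toy <- data.frame(
  CreditScoreRangeLower = c(640, 680, 480, 800, 680, 740),
  CurrentCreditLines    = c(5, 14, NA, 5, 19, 21),
  OpenCreditLines       = c(4, 14, NA, 5, 19, 17)
)

# In the listing, the upper bound is always the lower bound plus 19.
toy$CreditScoreRangeUpper <- toy$CreditScoreRangeLower + 19

# Pairwise-complete observations keep rows that are missing other columns.
corr <- cor(toy, use = "pairwise.complete.obs")
print(round(corr, 2))
```

Pairs such as these dominate the top of a correlation ranking, which is why the strongest correlations are often the least informative ones.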

##    StatedMonthlyIncome LoanOriginalAmount
## 1             8333.333              35000
## 2             8333.333              30000
## 3             8333.333              35000
## 4             8333.333              30000
## 5             8333.333              35000
## 6             8333.333              35000
## 7             8333.333              35000
## 8             8333.333              35000
## 9             8333.333              30000
## 10            8333.333              30000
## 11            8333.333              30000
## 12            8333.333              35000
## 13            8333.333              30000
## 14            8333.333              28000
## 15            8333.333              28000
## 16            8333.333              35000
## 17            8333.333              35000
## 18            8333.333              30000
## 19            8333.333              30000
## 20            8333.333              30000
## 21            8333.333              35000
## 22            8333.333              34700
## 23            8333.333              35000
## 24            8333.333              32500
## 25            8333.333              35000
## 26            8333.333              35000
## 27            8333.333              35000
## 28            8333.333              32500
## 29            8333.333              30000
## 30            8333.333              35000
## 31            8333.333              30000
## 32            8333.333              30000
## 33            8333.333              28000
## 34            8333.333              35000
## 35            8333.333              28500
## 36            8333.333              35000
## 37            8333.333              30000
## 38            8333.333              30000
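The value 8333.333 in the listing above is one twelfth of 100,000, so the apparent threshold corresponds to a stated yearly income of $100,000. The sketch below shows one way such a threshold could be located: filter to loans above $25,000 and take the smallest income among them. The toy rows are made up, and the source does not show the query the author actually ran.

```r
# Toy loans; incomes and amounts are illustrative only.
toy <- data.frame(
  StatedMonthlyIncome = c(3083, 6125, 8333.333, 8333.333, 9583, 12000),
  LoanOriginalAmount  = c(9425, 10000, 35000, 30000, 15000, 28000)
)

# Keep only the loans larger than $25,000.
large <- toy[toy$LoanOriginalAmount > 25000, ]

# The smallest income among large loans is the apparent cutoff.
min_income <- min(large$StatedMonthlyIncome)
annual_equivalent <- round(min_income * 12)

print(min_income)
print(annual_equivalent)
```

A cutoff at exactly $100,000 a year looks more like a lending rule than a coincidence, though the data alone cannot confirm that.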

If you look at how much money a loan is estimated to make on average, loans to borrowers with a C rating tend to give the largest returns.

## $AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    14.6   288.8   579.1   639.2   853.5  2830.8 
## 
## $A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   25.52  382.19  708.00  807.56 1090.50 3055.00 
## 
## $B
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    -8.0   516.6   894.9   998.8  1357.5  3570.0 
## 
## $C
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   -60.0   509.9   927.4  1006.8  1338.9  4117.5 
## 
## $D
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##   -5.168  420.400  756.290  832.883 1146.156 3096.000 
## 
## $E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -21.42  404.08  480.90  567.58  704.41 2930.37 
## 
## $HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -1656.0   311.5   453.9   395.0   498.4  2028.5 
## 
## $NC
## NULL
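The source does not state how the dollar figures above were derived. One plausible reading, used here purely as an assumption, is the estimated return rate multiplied by the original loan amount, summarized per grade with `tapply()`. The toy data below is made up and only shows the mechanics.

```r
# Toy loans; rates and amounts are illustrative only.
toy <- data.frame(
  Grade = factor(c("AA", "AA", "C", "C", "HR", "HR"),
                 levels = c("AA", "C", "HR")),
  EstimatedReturn    = c(0.05, 0.06, 0.09, 0.10, 0.12, 0.11),
  LoanOriginalAmount = c(10000, 12000, 10000, 11000, 4000, 3000)
)

# Assumed definition: estimated dollars = return rate x loan amount.
toy$EstimatedDollars <- toy$EstimatedReturn * toy$LoanOriginalAmount

# Mean estimated dollars per grade.
by_grade <- tapply(toy$EstimatedDollars, toy$Grade, mean)
print(by_grade)
```

In this toy example the HR loans have the highest rates but the smallest dollar returns because the loans themselves are smaller, which is one way a mid-grade loan can come out ahead in dollars.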

In percentage terms, the mean estimated return increases as loans get riskier, from AA through E, which makes sense since APR increases as well; it then dips slightly for HR loans. The differences between grades are statistically significant: the analysis of variance gives an F value of 13792, and the Tukey post hoc test reports an extremely low adjusted p value for every pair of grades.

## $AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01460 0.04554 0.05100 0.05399 0.05540 0.19360 
## 
## $A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01780 0.06081 0.06663 0.06965 0.07284 0.18310 
## 
## $B
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.00100  0.07408  0.08215  0.08629  0.09260  0.28370 
## 
## $C
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.00910  0.08227  0.09220  0.09810  0.11050  0.26670 
## 
## $D
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.0045  0.1012  0.1163  0.1187  0.1414  0.2332 
## 
## $E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.0124  0.1054  0.1239  0.1247  0.1487  0.1843 
## 
## $HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.1827  0.1135  0.1221  0.1136  0.1246  0.1399 
## 
## $NC
## NULL
##                    Df Sum Sq Mean Sq F value Pr(>F)    
## AllCreditGrades     6  38.73   6.454   13792 <2e-16 ***
## Residuals       84846  39.71   0.000                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 29084 observations deleted due to missingness
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = EstimatedReturn ~ AllCreditGrades, data = Loans)
## 
## $AllCreditGrades
##               diff          lwr          upr p adj
## A-AA   0.015656356  0.014638105  0.016674606     0
## B-AA   0.032301445  0.031292310  0.033310580     0
## C-AA   0.044112993  0.043123541  0.045102445     0
## D-AA   0.064737053  0.063716141  0.065757964     0
## E-AA   0.070700639  0.069617781  0.071783496     0
## HR-AA  0.059635718  0.058476470  0.060794967     0
## B-A    0.016645089  0.015909794  0.017380385     0
## C-A    0.028456637  0.027748597  0.029164678     0
## D-A    0.049080697  0.048329321  0.049832073     0
## E-A    0.055044283  0.054210684  0.055877882     0
## HR-A   0.043979363  0.043048684  0.044910042     0
## C-B    0.011811548  0.011116681  0.012506415     0
## D-B    0.032435608  0.031696633  0.033174583     0
## E-B    0.038399194  0.037576755  0.039221632     0
## HR-B   0.027334273  0.026413577  0.028254970     0
## D-C    0.020624060  0.019912198  0.021335921     0
## E-C    0.026587646  0.025789480  0.027385811     0
## HR-C   0.015522725  0.014623645  0.016421805     0
## E-D    0.005963586  0.005126739  0.006800432     0
## HR-D  -0.005101334 -0.006034924 -0.004167745     0
## HR-E  -0.011064920 -0.012065875 -0.010063966     0

Of loans that are completed and not charged off or defaulted, the highest mean return actually comes from B-grade loans ($1,312.0), closely followed by D-grade loans ($1,293.4). D-grade loans have the highest median return ($974.4).

## $AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -7221.4   137.3   429.3   871.6  1103.5 13227.2 
## 
## $A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -8444.1   255.9   640.0  1029.8  1337.9 15702.4 
## 
## $B
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -13566.3    390.0    914.7   1312.0   1794.0  10689.1 
## 
## $C
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -3000.0   362.9   844.2  1218.4  1605.6 11807.9 
## 
## $D
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -8818.9   467.7   974.4  1293.4  1777.2 13013.0 
## 
## $E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -2904.2   411.6   829.3  1108.5  1522.3 10803.2 
## 
## $HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -512.5   387.5   729.5   921.1  1270.4  9164.2 
## 
## $NC
## NULL
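Because these distributions are skewed, the ranking of grades depends on whether the mean or the median is used. The sketch below copies both statistics from the output above and reports which grade leads on each.

```r
# Mean and median return per grade, copied from the output above.
grades  <- c("AA", "A", "B", "C", "D", "E", "HR")
means   <- c(871.6, 1029.8, 1312.0, 1218.4, 1293.4, 1108.5, 921.1)
medians <- c(429.3, 640.0, 914.7, 844.2, 974.4, 829.3, 729.5)
names(means) <- grades
names(medians) <- grades

# Grade with the highest value on each measure.
top_by_mean   <- names(which.max(means))
top_by_median <- names(which.max(medians))

print(top_by_mean)
print(top_by_median)
```

B leads on the mean and D on the median. Since every grade has a mean well above its median, a few large returns pull the means upward.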

Charged-off loans seem to have begun in earnest during the 2008 recession.

It seems that borrowers whose loans are charged off or defaulted have about the same median number of current credit lines (8 and 10, respectively) as those who complete their loans (9).

## $Cancelled
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       9       9       9       9       9       9 
## 
## $Chargedoff
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   5.000   8.000   8.846  12.000  48.000 
## 
## $Completed
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   6.000   9.000   9.692  13.000  59.000 
## 
## $Current
## NULL
## 
## $Defaulted
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    6.00   10.00   10.64   14.00   52.00 
## 
## $FinalPaymentInProgress
## NULL
## 
## $`Past Due (>120 days)`
## NULL
## 
## $`Past Due (1-15 days)`
## NULL
## 
## $`Past Due (16-30 days)`
## NULL
## 
## $`Past Due (31-60 days)`
## NULL
## 
## $`Past Due (61-90 days)`
## NULL
## 
## $`Past Due (91-120 days)`
## NULL

As your credit grade gets better, your APR gets lower. This relationship is not due to random variation: the analysis of variance gives an F value of 55574, and the post hoc test shows extremely low p values for all pairs. Another important observation is that even within a specific credit grade a wide range of APRs is possible. Even with an HR credit grade, a borrower can get a loan with an APR as low as 0.00864.
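With more than 100,000 loans, even small differences give a large F value, so it helps to also ask how much of the variation the grade explains. A common measure is eta squared, the between-group sum of squares divided by the total. The sketch below computes it from the rounded Sum Sq values printed in this section's two ANOVA tables. It is a supplementary calculation and not part of the original analysis.

```r
# Sum of squares copied from the printed ANOVA tables (rounded values).
sum_sq <- data.frame(
  response = c("EstimatedReturn", "BorrowerAPR"),
  between  = c(38.73, 568.3),   # AllCreditGrades row
  residual = c(39.71, 166.2)    # Residuals row
)

# Eta squared: share of total variation attributable to the grade.
eta_sq <- sum_sq$between / (sum_sq$between + sum_sq$residual)
names(eta_sq) <- sum_sq$response

print(round(eta_sq, 3))
```

On these figures, credit grade accounts for roughly half of the variation in estimated return and about three quarters of the variation in APR, which fits the wide within-grade APR ranges noted above.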

## $AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01650 0.08325 0.09136 0.09641 0.10140 0.33172 
## 
## $A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01315 0.12449 0.13706 0.13831 0.15043 0.36623 
## 
## $B
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01325 0.16653 0.17754 0.17970 0.19501 0.37633 
## 
## $C
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00653 0.20040 0.22108 0.21844 0.24205 0.40243 
## 
## $D
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00653 0.24614 0.27467 0.26598 0.29510 0.41355 
## 
## $E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01657 0.30131 0.32436 0.31551 0.34621 0.41355 
## 
## $HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00864 0.30554 0.35643 0.32759 0.35797 0.51229 
## 
## $NC
## NULL
##                     Df Sum Sq Mean Sq F value Pr(>F)    
## AllCreditGrades      7  568.3   81.18   55574 <2e-16 ***
## Residuals       113773  166.2    0.00                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 156 observations deleted due to missingness
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = BorrowerAPR ~ AllCreditGrades, data = Loans)
## 
## $AllCreditGrades
##              diff          lwr         upr   p adj
## A-AA   0.04190859  0.040403734  0.04341344 0.0e+00
## B-AA   0.08329820  0.081819862  0.08477653 0.0e+00
## C-AA   0.12203080  0.120591084  0.12347051 0.0e+00
## D-AA   0.16957414  0.168089480  0.17105881 0.0e+00
## E-AA   0.21910684  0.217513370  0.22070031 0.0e+00
## HR-AA  0.23118069  0.229507762  0.23285362 0.0e+00
## NC-AA  0.13860918  0.128741704  0.14847666 0.0e+00
## B-A    0.04138961  0.040196623  0.04258259 0.0e+00
## C-A    0.08012221  0.078977432  0.08126699 0.0e+00
## D-A    0.12766556  0.126464737  0.12886637 0.0e+00
## E-A    0.17719825  0.175865254  0.17853125 0.0e+00
## HR-A   0.18927210  0.187845067  0.19069914 0.0e+00
## NC-A   0.09670060  0.086871817  0.10652937 0.0e+00
## C-B    0.03873260  0.037622914  0.03984229 0.0e+00
## D-B    0.08627595  0.085108533  0.08744336 0.0e+00
## E-B    0.13580864  0.134505657  0.13711163 0.0e+00
## HR-B   0.14788249  0.146483451  0.14928154 0.0e+00
## NC-B   0.05531099  0.045486234  0.06513574 0.0e+00
## D-C    0.04754334  0.046425239  0.04866145 0.0e+00
## E-C    0.09707604  0.095817042  0.09833504 0.0e+00
## HR-C   0.10914989  0.107791722  0.11050806 0.0e+00
## NC-C   0.01657838  0.006759368  0.02639740 8.6e-06
## E-D    0.04953270  0.048222535  0.05084286 0.0e+00
## HR-D   0.06160655  0.060200820  0.06301228 0.0e+00
## NC-D  -0.03096496 -0.040790667 -0.02113925 0.0e+00
## HR-E   0.01207385  0.010553657  0.01359405 0.0e+00
## NC-E  -0.08049766 -0.090340392 -0.07065492 0.0e+00
## NC-HR -0.09257151 -0.102427420 -0.08271560 0.0e+00
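
The fit line in the output shows the model `aov(BorrowerAPR ~ AllCreditGrades, data = Loans)`. A minimal sketch of that ANOVA followed by a Tukey post hoc test is below. The `Loans` data frame here is synthetic (four made-up grades with invented APRs), so the numbers will not match the real output.

```r
# Synthetic stand-in for the Loans data frame (values invented for illustration)
set.seed(42)
grades <- c("AA", "A", "B", "C")
Loans <- data.frame(
  AllCreditGrades = factor(rep(grades, each = 40), levels = grades),
  BorrowerAPR = c(rnorm(40, 0.09, 0.02), rnorm(40, 0.14, 0.02),
                  rnorm(40, 0.18, 0.02), rnorm(40, 0.22, 0.02))
)

# One-way ANOVA: does mean APR differ across credit grades?
fit <- aov(BorrowerAPR ~ AllCreditGrades, data = Loans)
print(summary(fit))

# Tukey HSD compares every pair of grades at a 95% family-wise confidence level
posthoc <- TukeyHSD(fit, "AllCreditGrades", conf.level = 0.95)
print(posthoc)
```

The `p adj` column is the one to read for each pair: it is already corrected for the number of comparisons.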

I found the median APR for all credit grades except AA increased after 2009.

CreditGrade  median_before_2009  mean_before_2009    n.x  median_after_2009  mean_after_2009    n.y
A                      0.125520         0.1356993   3314            0.13799        0.1389094  14551
AA                     0.096880         0.1061880   3495            0.09000        0.0900407   5372
B                      0.156020         0.1643373   4387            0.18173        0.1840300  15581
C                      0.180440         0.1934553   5646            0.22362        0.2261244  18345
D                      0.213975         0.2255261   5152            0.28488        0.2805805  14274
E                      0.269135         0.2707125   3288            0.33215        0.3305506   9795
HR                     0.281790         0.2712610   3506            0.35797        0.3560612   6935
NA                     0.219450         0.2265969  84984            0.18224        0.1959623  29059
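
A table like the one above can be built by summarising each period separately and merging on credit grade. The sketch below is only one way to do it, under several assumptions: it uses `ListingCreationDate` as the date, it counts 2009 itself as "after", and the rows are synthetic. I do not know which date column or cutoff produced the real table.

```r
# Synthetic stand-in for the Loans data frame (values invented for illustration)
set.seed(3)
Loans <- data.frame(
  CreditGrade = factor(sample(c("AA", "A", "HR"), 300, replace = TRUE)),
  ListingCreationDate = as.Date("2006-01-01") + sample(0:2900, 300, replace = TRUE),
  BorrowerAPR = runif(300, 0.05, 0.36)
)
Loans$Year <- as.integer(format(Loans$ListingCreationDate, "%Y"))

# Median, mean and count of APR per credit grade for one period
summarise_apr <- function(df) {
  out <- aggregate(BorrowerAPR ~ CreditGrade, data = df,
                   FUN = function(x) c(median = median(x), mean = mean(x), n = length(x)))
  # aggregate() stores the three statistics in a matrix column; flatten it
  data.frame(CreditGrade = out$CreditGrade, out$BorrowerAPR)
}

before <- summarise_apr(Loans[Loans$Year < 2009, ])   # assumption: cutoff at start of 2009
after  <- summarise_apr(Loans[Loans$Year >= 2009, ])

# Join the two periods; the suffixes label which period each column came from
comparison <- merge(before, after, by = "CreditGrade",
                    suffixes = c("_before_2009", "_after_2009"))
print(comparison)
```

Leaving `suffixes` at its default in `merge()` would produce `.x` and `.y` endings, like the `n.x` and `n.y` columns in the table.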

If you look closer at the estimated returns for HR loans, some were negative in 2009 and 2010, but in normal economic times HR loans usually have the highest estimated return.

## $`2009`
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.18270  0.00500  0.10200  0.06753  0.13990  0.13990 
## 
## $`2010`
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.17730  0.06355  0.12215  0.08788  0.13690  0.13990 
## 
## $`2011`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0887  0.1087  0.1148  0.1140  0.1246  0.1267 
## 
## $`2012`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1124  0.1221  0.1246  0.1228  0.1246  0.1271 
## 
## $`2013`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1043  0.1131  0.1135  0.1145  0.1185  0.1185 
## 
## $`2014`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.09975 0.10431 0.10431 0.10494 0.10663 0.10663

Charged-off and defaulted loans tended to have a higher estimated return than completed loans (medians of about 0.125 and 0.127 versus 0.107). Sadly, these loans most likely will not get paid off.

## $Cancelled
## NULL
## 
## $Chargedoff
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.1816  0.1108  0.1246  0.1234  0.1440  0.2837 
## 
## $Completed
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  -0.183   0.072   0.107   0.102   0.132   0.267   18410 
## 
## $Current
## NULL
## 
## $Defaulted
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  -0.046   0.109   0.127   0.123   0.144   0.254    4013 
## 
## $FinalPaymentInProgress
## NULL
## 
## $`Past Due (>120 days)`
## NULL
## 
## $`Past Due (1-15 days)`
## NULL
## 
## $`Past Due (16-30 days)`
## NULL
## 
## $`Past Due (31-60 days)`
## NULL
## 
## $`Past Due (61-90 days)`
## NULL
## 
## $`Past Due (91-120 days)`
## NULL

It seems Prosper mostly facilitates debt consolidation loans to borrowers with a C credit grade.

The C credit grade seems to be the most prevalent in most states.

It seems that more highly educated borrowers with higher-paying jobs tend to have higher credit grades, but a lot of borrowers with high-paying jobs still have bad credit ratings.

Most loans go to borrowers who are employed or employed full-time.

It would seem that the higher the borrower’s credit grade, the larger their loan amount usually is. On the other hand, the difference between the NC and HR mean loan amounts does not seem to be significant based on a post hoc Tukey test (p adj of about 0.68). It is also very interesting that the median loan amount is identical (10000) for the AA, A and B credit ratings, even though their means differ slightly.

## $AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    4851   10000   10620   15000   35000 
## 
## $A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    5000   10000   11058   15000   35000 
## 
## $B
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    5000   10000   10895   15000   35000 
## 
## $C
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    4000    8500    9382   15000   25000 
## 
## $D
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    3200    5000    6474   10000   25000 
## 
## $E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    3000    4000    4286    5000   25000 
## 
## $HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    2000    3200    3124    4000   20000 
## 
## $NC
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    1000    2000    2316    3000   10000
##                     Df    Sum Sq   Mean Sq F value Pr(>F)    
## AllCreditGrades      7 9.064e+11 1.295e+11    4169 <2e-16 ***
## Residuals       113798 3.535e+12 3.106e+07                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 131 observations deleted due to missingness
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = LoanOriginalAmount ~ AllCreditGrades, data = Loans)
## 
## $AllCreditGrades
##             diff          lwr         upr     p adj
## A-AA    437.9988    218.68401   657.31367 0.0000000
## B-AA    275.3728     59.92821   490.81733 0.0027090
## C-AA  -1238.0426  -1447.85232 -1028.23281 0.0000000
## D-AA  -4145.7516  -4362.12101 -3929.38220 0.0000000
## E-AA  -6333.5255  -6565.76680 -6101.28419 0.0000000
## HR-AA -7495.6676  -7739.49364 -7251.84166 0.0000000
## NC-AA -8303.2314  -9737.02281 -6869.43995 0.0000000
## B-A    -162.6261   -336.57628    11.32415 0.0868471
## C-A   -1676.0414  -1842.96190 -1509.12090 0.0000000
## D-A   -4583.7504  -4758.84481 -4408.65607 0.0000000
## E-A   -6771.5243  -6965.89085 -6577.15782 0.0000000
## HR-A  -7933.6665  -8141.73723 -7725.59574 0.0000000
## NC-A  -8741.2302 -10169.37593 -7313.08450 0.0000000
## C-B   -1513.4153  -1675.21712 -1351.61355 0.0000000
## D-B   -4421.1244  -4591.34600 -4250.90275 0.0000000
## E-B   -6608.8983  -6798.88697 -6418.90957 0.0000000
## HR-B  -7771.0404  -7975.02767 -7567.05317 0.0000000
## NC-B  -8578.6041 -10006.16064 -7151.04766 0.0000000
## D-C   -2907.7090  -3070.74026 -2744.67782 0.0000000
## E-C   -5095.4829  -5279.05712 -4911.90875 0.0000000
## HR-C  -6257.6251  -6455.65178 -6059.59839 0.0000000
## NC-C  -7065.1888  -8491.90579 -5638.47184 0.0000000
## E-D   -2187.7739  -2378.81072 -1996.73707 0.0000000
## HR-D  -3349.9160  -3554.87985 -3144.95225 0.0000000
## NC-D  -4157.4798  -5585.17614 -2729.78341 0.0000000
## HR-E  -1162.1422  -1383.79608  -940.48822 0.0000000
## NC-E  -1969.7059  -3399.89370  -539.51806 0.0007845
## NC-HR  -807.5637  -2239.67836   624.55089 0.6813763
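
With 28 pairs in the table above, it is easy to miss the few that are not significant. The sketch below shows one way to pull out only the pairs whose adjusted p-value is above 0.05. The data are synthetic, and the 0.05 cutoff is my assumption based on the 95% family-wise confidence level.

```r
# Synthetic stand-in for the Loans data frame (values invented for illustration)
set.seed(7)
grades <- c("AA", "A", "B", "HR")
Loans <- data.frame(
  AllCreditGrades = factor(rep(grades, each = 60), levels = grades),
  LoanOriginalAmount = c(rnorm(60, 10600, 3000), rnorm(60, 11000, 3000),
                         rnorm(60, 10900, 3000), rnorm(60, 3100, 1500))
)

fit <- aov(LoanOriginalAmount ~ AllCreditGrades, data = Loans)

# TukeyHSD() returns one matrix per factor; convert it to a data frame to filter rows
pairs <- as.data.frame(TukeyHSD(fit)$AllCreditGrades)
names(pairs) <- c("diff", "lwr", "upr", "p_adj")

# Keep only the pairs whose mean difference is not significant at the 0.05 level
not_significant <- pairs[pairs$p_adj > 0.05, ]
print(not_significant)
```

In the real output this kind of filter would leave only B-A and NC-HR.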

It is interesting that the lower credit score range for credit grade HR has two distinct spikes, around 500 and 660. Credit grade D also has two peaks, around 560 and 660.

Homemakers are the most likely to have a DebtToIncomeRatio of over 10:1. Is this because homemakers typically report little or no income of their own, so even a modest debt produces a very large ratio?

Although most people receive no recommendations for a loan, those with a lower debt-to-income range have a larger number of outlier values with high numbers of recommendations.

When an investor lends money to a friend, it seems they were more likely to invest more with a friend who has the very low credit grade of HR, although this is counterintuitive.

Multivariate Plots

Charge-offs came from people with anywhere from a small to a large number of delinquencies in the last 7 years, meaning the number of delinquencies would not be the best statistic for telling whether someone was going to stop paying their loan.

There are more very low or negative return outliers for high-risk loans in the Auto, Business, Debt Consolidation, and Home Improvement categories. There is also a very wide distribution of estimated returns for Student Use loans.

It seems the higher your credit score, the more money is lost when you do not repay your loan.

As one looks closer at the Estimated Returns, one finds that borrowers with lower credit grades have the potential to give higher returns in normal prosperous economic times. On the other hand, in bad economic times they also have the potential to yield significantly lower to negative returns.

It is interesting that if you have over 35 lines of credit your credit score seems to be very volatile.

It makes sense that as you get a better credit rating you have fewer delinquencies and a higher number of CurrentCreditLines.

It is interesting that the amount of money delinquent at its peak is so similar between credit ratings A and HR.

It seems the number of recommendations a borrower received did not play a large role in the Estimated Return from their loan on average. At the same time it seems HR loans were a lot more likely to have negative returns.

Final Plots and Summary

I found the median APR for all credit grades except AA increased after 2009. It was especially interesting that the mean APR for an AA credit grade borrower decreased by about 15% after 2009, while the mean APR for an HR credit grade borrower increased by over 31%. Could such drastic changes have been caused by the huge credit crisis of the 2008-2009 recession? This chart alone shows me the significance of having a great credit rating!

CreditGrade  median_before_2009  mean_before_2009    n.x  median_after_2009  mean_after_2009    n.y
A                      0.125520         0.1356993   3314            0.13799        0.1389094  14551
AA                     0.096880         0.1061880   3495            0.09000        0.0900407   5372
B                      0.156020         0.1643373   4387            0.18173        0.1840300  15581
C                      0.180440         0.1934553   5646            0.22362        0.2261244  18345
D                      0.213975         0.2255261   5152            0.28488        0.2805805  14274
E                      0.269135         0.2707125   3288            0.33215        0.3305506   9795
HR                     0.281790         0.2712610   3506            0.35797        0.3560612   6935
NA                     0.219450         0.2265969  84984            0.18224        0.1959623  29059
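
The 15% and 31% figures are relative changes in the mean APR, not percentage-point differences. Here is a small worked check using the AA and HR means copied from the table; the data frame and column name `pct_change` are just my own scaffolding for the arithmetic.

```r
# Mean APRs for AA and HR, copied from the table above
apr <- data.frame(
  CreditGrade = c("AA", "HR"),
  mean_before_2009 = c(0.1061880, 0.2712610),
  mean_after_2009  = c(0.0900407, 0.3560612)
)

# Relative change in percent: 100 * (after - before) / before
apr$pct_change <- 100 * (apr$mean_after_2009 - apr$mean_before_2009) / apr$mean_before_2009
print(round(apr$pct_change, 1))  # AA is about -15.2, HR is about 31.3
```

In percentage points the moves are smaller: AA fell by roughly 1.6 points and HR rose by roughly 8.5 points.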

As one looks closer at the Estimated Returns, one finds that borrowers with lower credit grades have the potential to give higher returns in normal prosperous economic times. On the other hand, in bad economic times they also have the potential to yield significantly lower to negative returns. It is very important for investors to truly understand how difficult it is to get their money back at all if someone's loan is charged off.

I wanted to better understand the sheer magnitude of losses from charged-off loans, since I am guessing many investors who use Prosper don’t really understand the huge risks of specific loans. If you add up all the losses investors took on charged-off loans, you find investors lost $20,313,392, including $1,159,688 that lined Prosper's pockets. I would think twice before lending to someone with an HR credit grade, especially since Prosper still makes millions even if you lose your life savings.

Charge_off_losses Prosper_Profits
20313392 1159688

When an investor lends money to a friend, it seems they were more likely to invest more with a friend who has the very low credit grade of HR, although this is counterintuitive. Could this be because they feel sorry for their friend and lend them money purely because they think no one else would logically do so? I would love to be able to ask these people why they are lending this money to their friend with an HR rating.

Reflections

I wish I had been given more background on each of the data set’s variables. The vagueness of the data set’s variables made calculating the overall payout of all loans for investors on Prosper very difficult.
For example, listing number 150265 had a Net Principal Loss of $7603.16 on a $20,000 loan but had non-principal recovery payments amounting to $21,117.90. How is this possible?

I would also have loved it if the data set had more data on the investors themselves. How much money do these investors make a year, and what background do they have in investing? Such data could give me a better understanding of whether someone with no investing experience can make money investing in Prosper loans or is more likely to lose large quantities of money.

Finally, I wish I had a way to compare Prosper loan statistics to a normal bank's loan statistics. I would love to understand how the lending practices differ between the two and whether these differences change the overall loan returns on average. Do banks refuse to lend money to specific borrowers while Prosper allows such loans, since it still makes money from facilitating extremely risky loans?